Nerstrand: Fast Multi-Threaded Graph Clustering

Authors

  • Dominique LaSalle
  • George Karypis
Abstract

In this work we apply the multilevel paradigm to optimizing the modularity of a graph clustering on parallel shared-memory architectures. We improve upon the state of the art by introducing new methods for effectively and efficiently coarsening graphs with power-law degree distributions, detecting an unknown number of communities, and performing greedy modularity refinement in parallel. Finally, we present the culmination of this research, the graph clustering tool Nerstrand. In serial mode, Nerstrand runs in a fraction of the time of current methods and produces results of equal quality. When run with multiple threads, Nerstrand exhibits significant speedup without decreasing quality. Nerstrand works well on large graphs, clustering a graph with over 18 million vertices and 261 million edges in 18.3 seconds.
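
For readers unfamiliar with the objective being optimized, the following is a minimal sketch of the standard Newman-Girvan modularity score for an undirected, unweighted graph. It is not taken from Nerstrand and does not reflect its multilevel or parallel implementation; the function name `modularity`, the edge-list input format, and the example graph are all assumptions made for illustration.

```python
from collections import defaultdict


def modularity(edges, community):
    """Newman-Girvan modularity of an undirected, unweighted graph.

    edges: iterable of (u, v) pairs, each undirected edge listed once
           (hypothetical input format chosen for this sketch).
    community: dict mapping every vertex to a community label.
    """
    m = 0                         # total number of edges
    internal = defaultdict(int)   # edges with both endpoints in community c
    degree = defaultdict(int)     # sum of vertex degrees in community c
    for u, v in edges:
        m += 1
        cu, cv = community[u], community[v]
        degree[cu] += 1
        degree[cv] += 1
        if cu == cv:
            internal[cu] += 1
    if m == 0:
        return 0.0
    # Q = sum over communities of (internal edge fraction - expected fraction)
    return sum(internal[c] / m - (degree[c] / (2 * m)) ** 2 for c in degree)


# Illustrative graph: two triangles joined by a single bridge edge.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
two_clusters = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}
one_cluster = {v: "a" for v in range(6)}

print(modularity(edges, two_clusters))
print(modularity(edges, one_cluster))
```

Splitting the example graph into its two triangles scores 5/14, while placing every vertex in a single cluster scores 0, which is why a modularity-maximizing method would prefer the split.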

Similar resources

Multi-threaded modularity based graph clustering using the multilevel paradigm

Graphs are an important tool for modeling data in many diverse domains. Recent increases in sensor technology and deployment, the adoption of online services, and the scale of VLSI circuits have caused the size of these graphs to skyrocket. Finding clusters of highly connected vertices within these graphs is a critical part of their analysis. In this paper we apply the multilevel paradigm to the...

Multi-threading and clustering for scene graph systems

The support for multi-threaded applications in current scene graphs is very limited, if it is supported at all. This work presents an approach for a very general multi-threading framework that allows total separation of threads without total replication of data. It also supports extensions to clusters, both for sort-first and sort-last rendering configurations. The described concepts have been ...

An open source C++ implementation of multi-threaded Gaussian mixture models, k-means and expectation maximisation

Modelling of multivariate densities is a core component in many signal processing, pattern recognition and machine learning applications. The modelling is often done via Gaussian mixture models (GMMs), which use computationally expensive and potentially unstable training algorithms. We provide an overview of a fast and robust implementation of GMMs in the C++ language, employing multi-threaded ...

Large-Scale Graph-based Transductive Inference

We consider the issue of scalability of graph-based semi-supervised learning (SSL) algorithms. In this context, we propose a fast graph node ordering algorithm that improves (parallel) spatial locality by being cache cognizant. This approach allows for a near linear speedup on a shared-memory parallel machine to be achievable, and thus means that graph-based SSL can scale to very large data set...

Performance Characterization of Multi-threaded Graph Processing Applications on Intel Many-Integrated-Core Architecture

Intel Xeon Phi many-integrated-core (MIC) architectures usher in a new era of terascale integration. Among emerging killer applications, parallel graph processing has been a critical technique to analyze connected data. In this paper, we empirically evaluate various computing platforms including an Intel Xeon E5 CPU, an Nvidia Geforce GTX1070 GPU and a Xeon Phi 7210 processor codenamed Knights ...

Publication date: 2013